Comparison of three data mining models for prediction of advanced schistosomiasis prognosis in the Hubei province
Authors
Abstract
BACKGROUND To better assist medical professionals, this study developed three models (a multivariate logistic regression (LR) model, an artificial neural network (ANN) model, and a decision tree (DT) model) and compared their performance in predicting the prognosis of patients with advanced schistosomiasis residing in Hubei province.

METHODOLOGY/PRINCIPAL FINDINGS Schistosomiasis surveillance data were taken from a previous study based on a Hubei population sample that included 4136 advanced schistosomiasis cases. The predictive models were built with LR, ANN, and DT methods. From each of the three groups, 70% of the cases (2896 cases) were used as training data for the predictive models. The remaining 30% of the cases (1240 cases) were used as validation groups for comparing the performance of the three models. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. Univariate analysis indicated that 16 risk factors were significantly associated with patient prognosis. In the training group, the mean AUC was 0.8276 for LR, 0.9267 for ANN, and 0.8229 for DT. In the validation group, the mean AUC was 0.8349 for LR, 0.8318 for ANN, and 0.8148 for DT. The three models yielded similar results in terms of accuracy, sensitivity, and specificity.

CONCLUSIONS/SIGNIFICANCE Predictive models for advanced schistosomiasis prognosis using LR, ANN, and DT proved to be effective approaches based on our dataset. The ANN model outperformed the LR and DT models in terms of AUC.
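The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the six-case toy data are all assumptions made here for illustration. The AUC is computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, which is equivalent to the area under the ROC curve.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded 1 = positive, 0 = negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of actual positives that were predicted positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of actual negatives that were predicted negative."""
    return tn / (tn + fp)


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): true outcomes and model scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)

print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("accuracy:", accuracy(tp, fp, tn, fn))
print("AUC:", auc(y_true, scores))
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is one reason models can differ in AUC while showing similar values on the other three measures.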
Similar resources
Comparison of Three Decision-Making Models in Differentiating Five Types of Heart Disease: A Case Study in Ghaem Sub-Specialty Hospital
Introduction: Cardiovascular diseases are becoming the main cause of mortality and morbidity in most countries. The goal of this research was to predict the types of heart disease for more accurate diagnosis using data mining and neural network techniques. Method: This was an applied survey study; after data preprocessing, three approaches of neural network, decision making tree and Bayes simp...
Thermal conductivity of Water-based nanofluids: Prediction and comparison of models using machine learning
Statistical methods, and especially machine learning, have been increasingly used in nanofluid modeling. This paper presents some of the interesting and applicable methods for thermal conductivity prediction and compares them with each other according to results and errors that are defined. The thermal conductivity of nanofluids increases with the volume fraction and temperature. Machine learni...
Multi-Output Adaptive Neuro-Fuzzy Inference System for Prediction of Dissolved Metal Levels in Acid Rock Drainage: a Case Study
Pyrite oxidation, Acid Rock Drainage (ARD) generation, and associated release and transport of toxic metals are a major environmental concern for the mining industry. Estimation of the metal loading in ARD is a major task in developing an appropriate remediation strategy. In this study, an expert system, the Multi-Output Adaptive Neuro-Fuzzy Inference System (MANFIS), was used for estimation of...